Concept of a Rule-based Configurator for Auto-WEKA Using OpenML
Abstract
Despite a large amount of research devoted to improving meta-learning techniques, providing and using background knowledge for this task remains a challenge. In this paper we propose a mechanism for the automatic recommendation of suitable machine learning algorithms and their parameters. We use the OpenML database and a rule-based configurator to improve the Auto-WEKA tool. This paper discusses the concept of our approach and the prototype tool, based on the HEARTDROID rule engine, that is being developed.
Introduction
The objective of our work is to build a meta-learning recommendation system that guides a user through the process of solving a machine learning task. We use the data from OpenML's experiments to build meta-knowledge, which is later encoded as rules. This knowledge is then used to match the meta-attributes of a new dataset against the current meta-knowledge and obtain a set of algorithms that are likely to perform best. Finally, we use Auto-WEKA to optimize the parameters of this narrowed set of algorithms.
In our approach we follow the general meta-learning architecture previously proposed by Pavel Brazdil et al. [1]. We use data about machine learning experiments from the online collaborative platform OpenML¹. When creating the meta-knowledge, we use the Amelia-II algorithm to impute missing data that could not be obtained from OpenML [2]. In the rule-based configurator we take advantage of the HEARTDROID inference engine². Auto-WEKA performs hyper-parameter optimization, which we use to further tune the resulting recommendation [3].
We distinguish three phases in the recommendation mechanism: 1) knowledge acquisition, 2) recommendation, and 3) tuning. During the first phase, meta-knowledge is built from OpenML's data only. In the second phase, the system uses that meta-knowledge and a new dataset to build a set of suitable algorithms. Finally, these algorithms are configured automatically using Auto-WEKA.
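The recommendation phase described above can be illustrated with a minimal sketch, assuming that the meta-knowledge has already been encoded as simple condition/conclusion rules. The code below is plain Python and does not use HEARTDROID or its rule language. The `Rule` class, the `recommend` function, the meta-attribute names and every threshold are hypothetical and chosen only for illustration; the algorithm names are WEKA classifier class names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A dataset is summarised by named numeric meta-attributes (names are assumptions).
MetaAttributes = Dict[str, float]


@dataclass
class Rule:
    """One piece of meta-knowledge: if the condition holds, suggest algorithms."""
    name: str
    condition: Callable[[MetaAttributes], bool]
    algorithms: List[str]


# Hypothetical rules: thresholds and algorithm choices are illustrative only
# and are not taken from the paper or derived from OpenML data.
RULES = [
    Rule("few_instances",
         lambda m: m["n_instances"] < 1000,
         ["weka.classifiers.bayes.NaiveBayes", "weka.classifiers.lazy.IBk"]),
    Rule("many_features",
         lambda m: m["n_features"] > 100,
         ["weka.classifiers.functions.SMO"]),
    Rule("default",
         lambda m: True,
         ["weka.classifiers.trees.RandomForest"]),
]


def recommend(meta: MetaAttributes, rules: List[Rule] = RULES) -> List[str]:
    """Return the algorithms suggested by every rule that fires, without duplicates."""
    candidates: List[str] = []
    for rule in rules:
        if rule.condition(meta):
            for algorithm in rule.algorithms:
                if algorithm not in candidates:
                    candidates.append(algorithm)
    return candidates


if __name__ == "__main__":
    new_dataset = {"n_instances": 500, "n_features": 20}
    print(recommend(new_dataset))
```

In the approach described here, the narrowed candidate list would then be passed to Auto-WEKA for hyper-parameter optimization; that tuning step is not shown in the sketch.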
Building meta-knowledge
In the acquisition phase, the main goal is to build meta-knowledge that describes the dependencies between datasets and the performance of the machine learning algorithms executed on them. For every dataset in the OpenML database, a set of meta-attributes is available that includes: statistical information (e.g. number of classes and features, kurtosis of numeric attributes), information-theoretic characteristics (e.g. class
¹ http://www.openml.org/
² http://bitbucket.org/sbobek/heartdroid
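To make the statistical meta-attributes named above concrete, the following sketch computes the number of classes, the number of features and the kurtosis of numeric attributes for a small in-memory dataset. It is an illustration only: OpenML supplies such values itself, and the function names, dictionary keys and the choice of population excess kurtosis are assumptions, not the definitions used by OpenML or by the authors.

```python
from typing import Dict, List, Sequence


def kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        # Kurtosis is undefined for a constant column; 0.0 is an assumed convention.
        return 0.0
    return m4 / (m2 ** 2) - 3.0


def meta_attributes(columns: Dict[str, List[float]],
                    labels: List[str]) -> Dict[str, float]:
    """Summarise a dataset given its numeric columns and its class labels."""
    kurtoses = [kurtosis(column) for column in columns.values()]
    return {
        "n_instances": len(labels),
        "n_features": len(columns),
        "n_classes": len(set(labels)),
        # Averaging over attributes is one possible aggregation (assumption).
        "mean_kurtosis": sum(kurtoses) / len(kurtoses) if kurtoses else 0.0,
    }


if __name__ == "__main__":
    data = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 2.0, 2.0, 2.0, 2.0]}
    classes = ["x", "y", "x", "y", "x"]
    print(meta_attributes(data, classes))
```

A dictionary of this shape is the kind of input that a rule-based matcher could compare against stored meta-knowledge.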
Publication date: 2015